An Approach for Predicting Related Word for the Hindi Language

Authors

  • Monika Sharma
  • Dinesh Gopalani
  • Meenakshi Tripathi
  • R. M. K. Sinha
  • E. Arisoy
  • T. N. Sainath
Abstract

Without motivation, writing can be a cumbersome process. This work proposes a methodology that assists the user by providing reference information, such as related words, while composing an article or message. Smart systems with related-word prediction have become extremely prevalent for the English language, but there has been no comparably large effort for Hindi. The main goal of this dissertation work is to provide syntactically and semantically related words based on a continuous feature vector representation. A Continuous Bag of Words (CBOW) language model is used to obtain the feature vector representation of each word in the training set. Cosine distance, combined with a rule-based strategy, is used as the measure for finding the most related word in context. In a comparative study, we concluded that our method achieves higher accuracy than the existing method. This approach will help users write Hindi effectively and creatively.
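
The two ingredients named in the abstract, CBOW training pairs and cosine-based ranking, can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names (`cbow_pairs`, `cosine_similarity`, `most_related`), the window size, and the hand-made three-dimensional toy vectors are all assumptions made for illustration. The sketch ranks by cosine similarity, which equals 1 minus cosine distance, and omits the rule-based strategy because the abstract does not describe it.

```python
import math


def cbow_pairs(tokens, window=2):
    """Build (context_words, target_word) pairs, the input/output shape CBOW trains on."""
    pairs = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        context = left + right
        if context:  # skip a lone token that has no neighbours
            pairs.append((context, target))
    return pairs


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_related(word, vectors, k=3):
    """Return the k words whose vectors are closest to `word` by cosine similarity."""
    query = vectors[word]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Hypothetical hand-made vectors, NOT learned embeddings from the paper.
TOY_VECTORS = {
    "किताब": [0.9, 0.1, 0.0],     # "book"
    "पुस्तक": [0.85, 0.15, 0.05],  # "book" (synonym)
    "कलम": [0.6, 0.4, 0.1],       # "pen"
    "नदी": [0.0, 0.2, 0.9],       # "river"
}

if __name__ == "__main__":
    print(cbow_pairs(["मैं", "किताब", "पढ़ता", "हूँ"], window=1))
    print(most_related("किताब", TOY_VECTORS, k=2))
```

In a real system the vectors would come from a CBOW model trained on a Hindi corpus, and the ranked candidates would then pass through whatever rule-based filtering the full paper defines.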

Similar resources

Word Sense Disambiguation in Hindi Language Using Hyperspace Analogue to Language and Fuzzy C-Means Clustering

The problem of Word Sense Disambiguation (WSD) can be defined as the task of assigning the most appropriate sense to the polysemous word within a given context. Many supervised, unsupervised and semi-supervised approaches have been devised to deal with this problem, particularly, for the English language. However, this is not the case for Hindi language, where not much work has been done. In th...

Web Based Hindi to Punjabi Machine Translation System

Hindi and Punjabi are closely related languages with many similarities in syntax and vocabulary. Both Punjabi and Hindi originated from Sanskrit, which is one of the oldest languages. In terms of speakers, Hindi is the third most widely spoken language and Punjabi is the twelfth most widely spoken language. The Punjabi language is mostly used in Northern India and in some areas of Pakist...

Word Sense Disambiguation in Bengali applied to Bengali-Hindi Machine Translation

We have developed a word sense disambiguation (WSD) system for the Bengali language and applied the system to get the correct lexical choice in Bengali-Hindi machine translation. We are not aware of any existing system for Bengali WSD. Since there is no sense-annotated Bengali corpus or a sufficient amount of parallel corpus for the Bengali-Hindi language pair, we had to use an unsupervised approach. We use a...

A Hybrid Model for Urdu Hindi Transliteration

We report in this paper a novel hybrid approach for Urdu to Hindi transliteration that combines finite-state machine (FSM) based techniques with statistical word language model based approach. The output from the FSM is filtered with the word language model to produce the correct Hindi output. The main problem handled is the case of omission of diacritical marks from the input Urdu text. Our sy...

Material Development and English for Academic Purposes Word Lists; a Reductionist Approach

Nagy (1988) states that vocabulary is a prerequisite factor in comprehension. Drawing upon a reductionist approach and having in mind the prospects for material development, this study aimed at creating an English for Academic Purposes Word List (EAPWL). The corpus of this study was compiled from a corpus containing 6479 pages of texts, 2,081,678 tokens (running words) and 63825 types (...

Publication date: 2015